library(pacman)
pacman::p_load(sf) # install the package if you haven't already
packageVersion('sf')[1] '1.0.16'
Practical - Week 2
2024-10-07
How to plot biodiversity data, explore patterns at different resolutions, and make pretty maps.
Exercises:
Please download the following files and store them on you data folder:
trees.gpkgmammalsCZ.rdsCZE_adm0.gpkgKvadratyCR_JTSK.gpkgFor geocomputation today we will use the sf package.
For downloading country-specific and world polygons we will use the rnaturalearth package.
For all the map plots today we will use the ggplot package. It’s part of the tidyverse.
Photo of Quercus petraea observed in Portugal by Luke Lythgoe licensed CC-BY, via iNaturalist
Steps:
POLYGONS)POLYGONS)We will get the data from:
Caudullo G., Welk E., San-Miguel-Ayanz J., 2017. Chorological maps for the main European woody species. Data in Brief 12, 662-666. DOI: 10.1016/j.dib.2017.05.007
We will only use the polygon data. See read_tree_data_to_gpkg.R to know how these data were processed.
What do the data look like?
Simple feature collection with 81 features and 1 field
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -180 ymin: 20.71311 xmax: 180 ymax: 72.64833
Geodetic CRS: WGS 84
First 10 features:
layer_name geom
1 Abies_alba MULTIPOLYGON (((16.23493 40...
2 Abies_borisiiregis MULTIPOLYGON (((21.87161 38...
3 Abies_cephalonica MULTIPOLYGON (((22.34088 37...
4 Abies_cilicica MULTIPOLYGON (((36.3169 34....
5 Abies_nebrodensis MULTIPOLYGON (((14.02312 37...
6 Abies_nordmanniana MULTIPOLYGON (((27.08184 39...
7 Abies_numidica MULTIPOLYGON (((5.44332 36....
8 Abies_pinsapo MULTIPOLYGON (((-5.093451 3...
9 Acer_campestre MULTIPOLYGON (((13.77384 37...
10 Acer_platanoides MULTIPOLYGON (((46.89706 43...
We will download a polygon of the European continent at medium scale, crop it by a defined extent, and finally combine all countries into a unique polygon.
Let’s see how it looks
Before working with this map, we need to project the layer.
Earth is not flat :) Projections help us represent the two-dimensional curved surface of the globe into 2D space. There are many ways to do this. Find here a cool Projection Wizard.
Equal-area maps preserve area measure, generally distorting shapes in order to do that
We will choose ETRS89-extended / LAEA Europe (EPSG:3035), which uses Lambert Azimuthal Equal Area projection.
Let’s transform the data (both the world and the trees’ layers). We can use the code EPSG:3035.
Let’s double check that everything is alright
POLYGONS)We will do this using the function st_make_grid()
POLYGONS)For the cellsize argument we will chose 1 degree, which is ~100km (=100,000m)
We have the grid, we are ready to calculate metrics per grid cell. Let’s see how it looks
POLYGONS)Let’s plot 5 random species as an example
Now let’s use st_join() to calculate the number of species per grid-cell (SR).
Here are the results :)
Simple feature collection with 1445 features and 3 fields
Geometry type: GEOMETRY
Dimension: XY
Bounding box: xmin: 1080899 ymin: 1429727 xmax: 7535328 ymax: 6037921
Projected CRS: ETRS89-extended / LAEA Europe
# A tibble: 1,445 × 4
gridID N SR .
<int> <int> <int> <GEOMETRY [m]>
1 1 0 0 POINT (1080899 2579705)
2 2 0 0 MULTIPOLYGON (((1110023 2538982, 1100617 2543824, 1097498…
3 3 0 0 POLYGON ((1280899 2362386, 1278435 2364368, 1273591 23748…
4 4 0 0 POLYGON ((1202323 2535208, 1209886 2532775, 1219309 25259…
5 5 0 0 POLYGON ((1298076 2253769, 1296932 2252054, 1292751 22550…
6 6 0 0 POLYGON ((1280899 2381543, 1281344 2381481, 1288628 23688…
7 7 0 0 POLYGON ((1809887 1531661, 1821808 1530276, 1833394 15180…
8 8 13 13 POLYGON ((2680899 1833872, 2668639 1826794, 2670835 18324…
9 9 14 14 POLYGON ((2680899 2007077, 2650401 1989469, 2652179 19948…
10 10 14 14 POLYGON ((2680899 1833872, 2730899 1805004, 2730899 17472…
# ℹ 1,435 more rows
POLYGONS)POLYGONS)
We are done! No it’s your turn :)
Photo of Capreolus capreolus observed in Czech Republic by romanvrbicek licensed CC-BY-NC, via iNaturalist
Steps:
POINTS)POLYGONS)You have 30 minutes to work on this. After that, we will code it together :)
We’ll use the data from last practical (downloaded from GBIF), but his time, we will do a data download using occ_download(). In this way, we can download unlimited records and get a DOI for our dataset (very important for citation).
Check out how it looks like in GBIF, doi: 10.15468/dl.a9fytb.
See download_mammalsCZ_data_from_GBIF.R to see how the data were downloaded.
Load the data (downloaded from GBIF)
# A tibble: 5,130 × 50
gbifID datasetKey occurrenceID kingdom phylum class order family genus
<int64> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 4.56e-315 6ac3f774-d9fb… "" Animal… Chord… Mamm… Lago… Lepor… Lepus
2 4.56e-315 6ac3f774-d9fb… "" Animal… Chord… Mamm… Prim… Homin… Homo
3 4.55e-315 6ac3f774-d9fb… "" Animal… Chord… Mamm… Arti… Cervi… Cerv…
4 4.55e-315 6ac3f774-d9fb… "" Animal… Chord… Mamm… Arti… Cervi… Capr…
5 4.55e-315 6ac3f774-d9fb… "" Animal… Chord… Mamm… Arti… Cervi… Capr…
6 4.55e-315 6ac3f774-d9fb… "" Animal… Chord… Mamm… Arti… Cervi… Capr…
7 4.55e-315 6ac3f774-d9fb… "" Animal… Chord… Mamm… Arti… Cervi… Capr…
8 4.55e-315 6ac3f774-d9fb… "" Animal… Chord… Mamm… Arti… Cervi… Capr…
9 4.54e-315 22a66350-7947… "urn:catalo… Animal… Chord… Mamm… Carn… Muste… Mart…
10 2.45e-314 50c9509d-22c7… "https://ww… Animal… Chord… Mamm… Arti… Suidae Sus
# ℹ 5,120 more rows
# ℹ 41 more variables: species <chr>, infraspecificEpithet <chr>,
# taxonRank <chr>, scientificName <chr>, verbatimScientificName <chr>,
# verbatimScientificNameAuthorship <chr>, countryCode <chr>, locality <chr>,
# stateProvince <chr>, occurrenceStatus <chr>, individualCount <int>,
# publishingOrgKey <chr>, decimalLatitude <dbl>, decimalLongitude <dbl>,
# coordinateUncertaintyInMeters <dbl>, coordinatePrecision <dbl>, …
Let’s keep only a few fields
We will transform our data (table) into an sf object using st_as_sf()
Simple feature collection with 5130 features and 5 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 12.203 ymin: 48.63992 xmax: 18.79605 ymax: 50.99892
Geodetic CRS: WGS 84
# A tibble: 5,130 × 6
species order eventDate stateProvince countryCode geometry
* <chr> <chr> <chr> <chr> <chr> <POINT [°]>
1 Lepus eu… Lago… 2011-06-… "" CZ (13.82827 48.8938)
2 Homo sap… Prim… 2013-12-… "" CZ (14.43048 50.09059)
3 Cervus e… Arti… 2012-12-… "" CZ (13.80655 48.82647)
4 Capreolu… Arti… 2013-10-… "" CZ (12.35683 50.13959)
5 Capreolu… Arti… 2011-06-… "" CZ (13.82827 48.8938)
6 Capreolu… Arti… 2013-06-… "" CZ (14.32012 50.09922)
7 Capreolu… Arti… 2011-06-… "" CZ (13.65935 48.99881)
8 Capreolu… Arti… 2012-12-… "" CZ (13.80655 48.82647)
9 Martes m… Carn… 1961-04-… "" CZ (18.28333 49.83333)
10 Sus scro… Arti… 2024-09-… "Středočeský" CZ (13.84809 50.23607)
# ℹ 5,120 more rows
And we save
Note that crs=4326 is related to the EPSG:4326, CRS: WGS 84
First, let’s load the country’s border using st_read
Simple feature collection with 1 feature and 70 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 12.08586 ymin: 48.54084 xmax: 18.86253 ymax: 51.05438
Geodetic CRS: WGS 84
GADMID ISO NAME_ENGLI NAME_ISO NAME_FAO NAME_LOCAL
1 59 CZE Czech Republic CZECH REPUBLIC Czech Republic Ceská republika
NAME_OBSOL NAME_VARIA NAME_NONLA NAME_FRENC NAME_SPANI
1 Moravia|Bohemia Cesko <NA> République Tchèque República Checa
NAME_RUSSI NAME_ARABI NAME_CHINE WASPARTOF CONTAINS
1 ????? ????????? ???????? ????? Czechoslovakia <NA>
SOVEREIGN ISO2 WWW FIPS ISON VALIDFR VALIDTO AndyID POP2000 SQKM
1 Czech Republic CZ <NA> EZ 203 19930101 Present 65 10271830 78495.16
POPSQKM UNREGION1 UNREGION2 DEVELOPING CIS Transition OECD
1 130.8594 Eastern Europe Europe 2 0 0 1
WBREGION WBINCOME WBDEBT WBOTHER CEEAC CEMAC
1 Europe & Central Asia Upper middle income Less indebted <NA> 0 0
CEPLG COMESA EAC ECOWAS IGAD IOC MRU SACU UEMOA UMA PALOP PARTA CACM EurAsEC
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Agadir SAARC ASEAN NAFTA GCC CSN CARICOM EU CAN ACP Landlocked AOSIS SIDS
1 0 0 0 0 0 0 0 1 0 0 1 0 0
Islands LDC Shape_Leng Shape_Area geom
1 0 0 26.21557 9.813959 MULTIPOLYGON (((13.17648 49...
We will actually load a Czech standard grid layer of 100 km2 area
But you can also create this grid by using st_make_grid() as we did before.
Simple feature collection with 678 features and 10 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -912065.7 ymin: -1229672 xmax: -421278.2 ymax: -928997.8
Projected CRS: S-JTSK / Krovak East North
First 10 features:
OBJECTID ENTITY AREA PERIMETE KVADRAT X Y Shape_Leng
1 1 3 130.032 45.626 4951 -740241.8 -935317.5 45626.41
2 2 5 130.036 45.627 4952 -728665.0 -936923.5 45627.17
3 3 6 130.303 45.675 5051 -741782.9 -946335.6 45675.21
4 4 8 130.040 45.628 4953 -717084.4 -938503.6 45627.96
5 5 9 130.307 45.676 5052 -730181.5 -947945.1 45675.88
6 6 12 130.311 45.677 5053 -718576.3 -949528.8 45676.59
7 7 14 130.050 45.630 4955 -693912.1 -941586.4 45629.61
8 8 17 130.055 45.630 4956 -682320.5 -943089.1 45630.48
9 9 18 130.319 45.678 5055 -695354.9 -952618.5 45678.09
10 10 19 130.060 45.631 4957 -670725.5 -944566.0 45631.36
Shape_Area ID geom
1 130031546 4951 POLYGON ((-745252.4 -928997...
2 130035855 4952 POLYGON ((-733689.7 -930614...
3 130302745 5051 POLYGON ((-746805.7 -940014...
4 130040349 4953 POLYGON ((-722123.3 -932206...
5 130306551 5052 POLYGON ((-735218.5 -941635...
6 130310574 5053 POLYGON ((-723627.4 -943229...
7 130049749 4955 POLYGON ((-698979.2 -935311...
8 130054715 4956 POLYGON ((-687401.8 -936825...
9 130319128 5055 POLYGON ((-700434.2 -946342...
10 130059749 4957 POLYGON ((-675820.7 -938313...
Check the layer’s Coordinate Reference System (CRS)
Check the layer’s Coordinate Reference System (CRS)
Coordinate Reference System:
User input: WGS 84
wkt:
GEOGCRS["WGS 84",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ENSEMBLEACCURACY[2.0]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["geodetic latitude (Lat)",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["geodetic longitude (Lon)",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
USAGE[
SCOPE["Horizontal component of 3D system."],
AREA["World."],
BBOX[-90,-180,90,180]],
ID["EPSG",4326]]
Coordinate Reference System:
User input: S-JTSK / Krovak East North
wkt:
PROJCRS["S-JTSK / Krovak East North",
BASEGEOGCRS["S-JTSK",
DATUM["System of the Unified Trigonometrical Cadastral Network",
ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4156]],
CONVERSION["Krovak East North (Greenwich)",
METHOD["Krovak (North Orientated)",
ID["EPSG",1041]],
PARAMETER["Latitude of projection centre",49.5,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8811]],
PARAMETER["Longitude of origin",24.8333333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8833]],
PARAMETER["Co-latitude of cone axis",30.2881397527778,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",1036]],
PARAMETER["Latitude of pseudo standard parallel",78.5,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8818]],
PARAMETER["Scale factor on pseudo standard parallel",0.9999,
SCALEUNIT["unity",1],
ID["EPSG",8819]],
PARAMETER["False easting",0,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",0,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["easting (X)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["northing (Y)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["GIS."],
AREA["Czechia; Slovakia."],
BBOX[47.73,12.09,51.06,22.56]],
ID["EPSG",5514]]
Frist, we will transform the layer’s CRS to S-JTSK / Krovak East North using st_transform().
Simple feature collection with 5130 features and 5 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -897040.7 ymin: -1224642 xmax: -435147.4 ymax: -943016.8
Projected CRS: S-JTSK / Krovak East North
# A tibble: 5,130 × 6
species order eventDate stateProvince countryCode geometry
* <chr> <chr> <chr> <chr> <chr> <POINT [m]>
1 Lepus eu… Lago… 2011-06-… "" CZ (-803959.7 -1168429)
2 Homo sap… Prim… 2013-12-… "" CZ (-742108.4 -1042758)
3 Cervus e… Arti… 2012-12-… "" CZ (-806616.4 -1175608)
4 Capreolu… Arti… 2013-10-… "" CZ (-887913.2 -1015134)
5 Capreolu… Arti… 2011-06-… "" CZ (-803959.7 -1168429)
6 Capreolu… Arti… 2013-06-… "" CZ (-749798.3 -1040726)
7 Capreolu… Arti… 2011-06-… "" CZ (-814507 -1155077)
8 Capreolu… Arti… 2012-12-… "" CZ (-806616.4 -1175608)
9 Martes m… Carn… 1961-04-… "" CZ (-470570.3 -1101814)
10 Sus scro… Arti… 2024-09-… "Středočeský" CZ (-781040.9 -1020909)
# ℹ 5,120 more rows
Let’s plot the sf objects
To calculate number of records and number of species per grid cell, we will use st_join()
To calculate number of records and number of species per grid cell, we will use st_join()
Simple feature collection with 5341 features and 15 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -912065.7 ymin: -1229672 xmax: -421278.2 ymax: -928997.8
Projected CRS: S-JTSK / Krovak East North
First 10 features:
OBJECTID ENTITY AREA PERIMETE KVADRAT X Y Shape_Leng
1 1 3 130.032 45.626 4951 -740241.8 -935317.5 45626.41
2 2 5 130.036 45.627 4952 -728665.0 -936923.5 45627.17
3 3 6 130.303 45.675 5051 -741782.9 -946335.6 45675.21
4 4 8 130.040 45.628 4953 -717084.4 -938503.6 45627.96
5 5 9 130.307 45.676 5052 -730181.5 -947945.1 45675.88
5.1 5 9 130.307 45.676 5052 -730181.5 -947945.1 45675.88
5.2 5 9 130.307 45.676 5052 -730181.5 -947945.1 45675.88
5.3 5 9 130.307 45.676 5052 -730181.5 -947945.1 45675.88
5.4 5 9 130.307 45.676 5052 -730181.5 -947945.1 45675.88
5.5 5 9 130.307 45.676 5052 -730181.5 -947945.1 45675.88
Shape_Area ID species order eventDate stateProvince
1 130031546 4951 <NA> <NA> <NA> <NA>
2 130035855 4952 <NA> <NA> <NA> <NA>
3 130302745 5051 <NA> <NA> <NA> <NA>
4 130040349 4953 <NA> <NA> <NA> <NA>
5 130306551 5052 Lepus europaeus Lagomorpha 2020-06-29 Ústecký kraj
5.1 130306551 5052 Capreolus capreolus Artiodactyla 2020-07-03 Ústecký kraj
5.2 130306551 5052 Capreolus capreolus Artiodactyla 2020-07-09 Ústecký kraj
5.3 130306551 5052 Capreolus capreolus Artiodactyla 2020-06-20 Ústecký kraj
5.4 130306551 5052 Capreolus capreolus Artiodactyla 2020-06-23 Ústecký kraj
5.5 130306551 5052 Lepus europaeus Lagomorpha 2020-07-13 Ústecký kraj
countryCode geom
1 <NA> POLYGON ((-745252.4 -928997...
2 <NA> POLYGON ((-733689.7 -930614...
3 <NA> POLYGON ((-746805.7 -940014...
4 <NA> POLYGON ((-722123.3 -932206...
5 CZ POLYGON ((-735218.5 -941635...
5.1 CZ POLYGON ((-735218.5 -941635...
5.2 CZ POLYGON ((-735218.5 -941635...
5.3 CZ POLYGON ((-735218.5 -941635...
5.4 CZ POLYGON ((-735218.5 -941635...
5.5 CZ POLYGON ((-735218.5 -941635...
Now, we need to summarise the data per grid-cell.
Luckily, tidyverse methods also work for sf objects :)
We will do it with group_by() and summarise().
First, we will group by grid-cells and then we count values per grid.
Let’s calculate the number of records (N) and the number of species per grid-cell (SR)
st_join(CZ_grids, mammalsCZ_sf) %>%
group_by(OBJECTID) %>% # the name of the column that has the index
summarise(N=sum(!is.na(species)), # calculates the number of points in the polygon
SR=n_distinct(species, na.rm = TRUE)) # calculates the number of different 'species' in the polygonSimple feature collection with 678 features and 3 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -912065.7 ymin: -1229672 xmax: -421278.2 ymax: -928997.8
Projected CRS: S-JTSK / Krovak East North
# A tibble: 678 × 4
OBJECTID N SR geom
<int> <int> <int> <POLYGON [m]>
1 1 0 0 ((-745252.4 -928997.8, -733689.7 -930614.9, -733689.7 -…
2 2 0 0 ((-733689.7 -930614.9, -722123.3 -932206.2, -722123.3 -…
3 3 0 0 ((-746805.7 -940014.3, -735218.5 -941635, -735218.5 -94…
4 4 0 0 ((-722123.3 -932206.2, -710553.1 -933771.6, -710553.1 -…
5 5 27 3 ((-735218.5 -941635, -723627.4 -943229.9, -723627.4 -94…
6 6 6 3 ((-723627.4 -943229.9, -712032.7 -944798.9, -712032.7 -…
7 7 0 0 ((-698979.2 -935311.3, -687401.8 -936825.2, -687401.8 -…
8 8 0 0 ((-687401.8 -936825.2, -675820.7 -938313.3, -675820.7 -…
9 9 0 0 ((-700434.2 -946342, -688832.2 -947859.3, -688832.2 -94…
10 10 0 0 ((-675820.7 -938313.3, -664236.1 -939775.7, -664236.1 -…
# ℹ 668 more rows
Let’s store the output into a new object CZ_mammals_grids.
Finally, let’s plot this.
We will do it using geom_sf(), a ggplot function to visualise sf objects
Where are the nice colours?
We need to indicate which column from the object should the grids be filled with.
Let’s get a nicer color scale.
Bear in mind that we see colors differently. Thus, it’s important to consider colorblind safe palettes
Check out The R Graph Gallery for many cool graphic options with ggplot
ggplot() +
geom_sf(data=mammalsCZ_grids, aes(fill=SR)) +
scale_fill_fermenter(palette ='YlOrBr', n.breaks=9, direction = 1) + # fill of the grids
geom_sf(data=CZ_borders, fill=NA) +
coord_sf(crs=4326) +
theme_bw()Now, here’s a better plot :)
The hotspots of species richness are in cities. How can these be the richest areas for mammals?
Let’s have a look at the sampling effort (N: number of records per grid cell) and compare both layers
What can you say about the hotspots of species richness we found?
In following steps (e.g., modelling) you should take into account this bias :)
POINTS)POLYGONS)